FPGA-accelerated machine learning inference as a service for particle physics computing
New heterogeneous computing paradigms on dedicated hardware with increased
parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting
solutions with large potential gains. The growing applications of machine
learning algorithms in particle physics for simulation, reconstruction, and
analysis are naturally deployed on such platforms. We demonstrate that the
acceleration of machine learning inference as a web service represents a
heterogeneous computing solution for particle physics experiments that
potentially requires minimal modification to the current computing model. As
examples, we retrain the ResNet-50 convolutional neural network to demonstrate
state-of-the-art performance for top quark jet tagging at the LHC and apply a
ResNet-50 model with transfer learning for neutrino event classification. Using
Project Brainwave by Microsoft to accelerate the ResNet-50 image classification
model, we achieve average inference times of 60 (10) milliseconds with our
experimental physics software framework using Brainwave as a cloud (edge or
on-premises) service, representing an improvement by a factor of approximately
30 (175) in model inference latency over traditional CPU inference in current
experimental hardware. A single FPGA service accessed by many CPUs achieves a
throughput of 600--700 inferences per second using an image batch of one,
comparable to large batch-size GPU throughput and significantly better than
small batch-size GPU throughput. Deployed as an edge or cloud service for the
particle physics computing model, coprocessor accelerators can have a higher
duty cycle and are potentially much more cost-effective.
Comment: 16 pages, 14 figures, 2 tables
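The abstract above describes serving inference over the network: many CPU clients each send a single image to one shared FPGA-backed web service and receive the model's prediction in the response. The pattern can be sketched with a minimal, self-contained mock in Python's standard library; the endpoint path, the request payload, and the returned JSON fields (`label`, `score`) are illustrative assumptions, not the actual Brainwave API.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

# Hypothetical stand-in for a remote accelerator service: the real service
# would run ResNet-50 on an FPGA; here we just return a canned prediction.
class MockInferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        _image_bytes = self.rfile.read(length)  # payload is one image (batch of 1)
        body = json.dumps({"label": "top_quark_jet", "score": 0.97}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo output quiet
        pass

def classify(url, image_bytes):
    """Send one image to the inference service and return the parsed result."""
    req = Request(url, data=image_bytes,
                  headers={"Content-Type": "application/octet-stream"})
    with urlopen(req) as resp:
        return json.loads(resp.read())

# Start the mock service on an ephemeral local port, as a cloud/edge
# service would be reachable at a fixed URL.
server = HTTPServer(("127.0.0.1", 0), MockInferenceHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/infer"

result = classify(url, b"\x00" * 16)  # dummy image bytes
print(result)  # → {'label': 'top_quark_jet', 'score': 0.97}
```

Because each request carries a batch of one, many independent clients can share the accelerator without coordinating batch formation, which is what lets the single-FPGA service reach the quoted 600-700 inferences per second.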
Development of an FPGA Emulator for the RD53A Test Chip
Thesis (Master's)--University of Washington, 2019
In 2024 the LHC will be shut down for an extended period to perform upgrades to the instrument and the detectors located on it. One such upgrade is the Front End ITk upgrade in ATLAS Pixel. New hardware is being developed for this upgrade, and a test chip called the RD53A has been created as a prototype of that future hardware. Because of restrictions associated with the RD53A chip (its limited availability to researchers, its inability to generate realistic data without radiation present, and the slow turnaround when changes to the physical chip are needed after issues are found), the Adaptive Computing Machines and Emulators (ACME) lab at the University of Washington built and maintains code to emulate the RD53A. The Emulator solves the problems posed by the physical chip: the code is downloadable from an online repository and runs on a commercially available FPGA, making it easily accessible to researchers; it generates realistic hit data without the need for irradiation; and when the Emulator must be altered to fix issues or to target certain research applications, turnaround can be completed in hours, not weeks. The motivation for the Emulator, its current state of development, and my personal work on its architecture and logic are explained in this thesis.
CERN openlab Technical Workshop
We present the Micron Deep Learning Accelerator (DLA), its software development kits, compiler, and applications.
We introduce current and future DLA versions and plans for additional software tools and support.
We also present a summary of current Micron collaboration and DLA-based activities with CERN.